Call for Workshop Papers

The 1st Workshop on Intelligent Audio-Visual Interaction in Digital Scenarios (IAVIDS 2026)

Motivation

The proliferation of intelligent digital technologies—such as generative artificial intelligence (GenAI), edge computing, and extended reality (XR)—has reshaped media ecosystems, driving the evolution of digital scenarios into immersive, interactive, and context-aware environments. Within these media-centric ecosystems, Intelligent Audio-Visual Interaction (IAVI) serves as the core interface between users and digital media, directly determining the depth of user engagement, the efficiency of media information dissemination, and the innovation potential of media applications.

This interdisciplinary field integrates theoretical frameworks from media computing, signal processing, cognitive science, and human-computer interaction with cutting-edge technologies, addressing the unique demands of media environments. However, critical unresolved challenges persist: semantic alignment of heterogeneous audio-visual media data in dynamic interaction scenarios, low-latency and high-fidelity audio-visual interaction optimization for resource-constrained media terminals, and human-centric adaptation of audio-visual interaction systems to diverse media consumption contexts.

This workshop aims to establish a high-caliber academic forum dedicated to media-centric intelligent audio-visual interaction research, bringing together researchers, engineers, scholars, and industry practitioners worldwide. We seek to facilitate the exchange of frontier findings, theoretical innovations, and empirical applications focused on audio-visual interaction in media environments; to foster cross-disciplinary integration between computing and media science; and to advance the theoretical depth and practical maturity of intelligent audio-visual interaction technology as it shapes the future of digital media interaction.

Topics

1. Theoretical Foundations & Methodological Innovations for Media-Centric Intelligent Audio-Visual Interaction

1.1 Mathematical modeling and formal frameworks for audio-visual fusion in interactive media contexts

1.2 Semantic alignment and cross-modal knowledge graphs for media content interaction

1.3 Cognitive science-informed theoretical paradigms for user-centric media audio-visual interaction design

1.4 Data-efficient learning methodologies optimized for media interaction scenario constraints (e.g., dynamic content, variable user contexts)

1.5 Ontologies and metadata standards for standardized audio-visual media interaction

2. Core Technologies for Media-Oriented Audio-Visual Interaction Advancement

2.1 Advanced audio signal processing for media interaction (e.g., real-time adaptive spatial audio, emotion-aware speech processing, media-specific noise suppression)

2.2 Intelligent video analysis for interactive media (e.g., gaze-driven content adaptation, gesture-based media control, scene-aware visual feedback)

2.3 GenAI-driven adaptive audio-visual content generation and interaction for personalized media experiences

2.4 Edge-cloud collaborative computing architectures for low-latency media audio-visual interaction systems

2.5 Synchronization and transmission optimization of audio-visual media streams in interactive scenarios

3. Media Environment Application Domains & Empirical Studies

3.1 Audio-visual interaction systems in XR-based immersive media (e.g., interactive virtual concerts, augmented reality news delivery, mixed reality media storytelling)

3.2 Intelligent audio-visual interaction in smart media platforms (e.g., interactive broadcasting, personalized content recommendation engines, cloud-based media production collaboration)

3.3 Audio-visual interaction for social media (e.g., real-time audio-visual enhancement for user-generated content, interactive media communication tools)

3.4 Educational media applications (e.g., immersive audio-visual teaching platforms, interactive media-based tutoring systems)

3.5 Digital cultural media (e.g., interactive audio-visual exhibitions, virtual heritage site media interaction systems)

Submission Guidelines

• The submission and review procedures are fully consistent with those of the main conference. Please select the track “Workshop IAVIDS 2026” during submission.

• Registration and publication of accepted papers follow the same procedures and standards as the main conference proceedings.

Workshop Chairs

Agnes Miaotong Yuan

School of Music and Recording Arts, Communication University of China, Beijing, 100024, China

Email: ymtaudio@cuc.edu.cn

Christopher Sauder Engeler

Multimedia Technologies, ETH Zürich, 8092 Zürich, Switzerland

Email: christopher.sauder@id.ethz.ch

Yuan Zhang

Department of Music Artificial Intelligence and Music Information Technology, Central Conservatory of Music, Beijing, 100031, China

Email: dazhangyu40@hotmail.com